{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "f76d45d2",
   "metadata": {},
   "source": [
    "### Extract Structured Data From Text: Expert Mode (Using Function Calling)\n",
    "\n",
    "We are going to explore [OpenAI's Function Calling](https://openai.com/blog/function-calling-and-other-api-updates) for extracting structured data from unstructured sources.\n",
    "\n",
    "**Why is this important?**\n",
    "LLMs are great at text output, but they need extra help outputing information in a structure that we want. A common request from developers is to get JSON data back from our LLMs.\n",
    "\n",
    "Spoiler: Jump down to the bottom to see a bonefied business idea that you can start and manage today."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "7764b585",
   "metadata": {},
   "outputs": [],
   "source": [
    "# LangChain Models\n",
    "from langchain.chat_models import ChatOpenAI\n",
    "from langchain.llms import OpenAI\n",
    "from langchain.schema import HumanMessage, SystemMessage, AIMessage\n",
    "\n",
    "# Standard Helpers\n",
    "import pandas as pd\n",
    "import requests\n",
    "import time\n",
    "import json\n",
    "from datetime import datetime\n",
    "import os\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "load_dotenv()\n",
    "\n",
    "# Text Helpers\n",
    "from bs4 import BeautifulSoup\n",
    "from markdownify import markdownify as md\n",
    "\n",
    "# For token counting\n",
    "from langchain.callbacks import get_openai_callback\n",
    "\n",
    "def printOutput(output):\n",
    "    print(json.dumps(output,sort_keys=True, indent=3))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "e1728e00",
   "metadata": {},
   "outputs": [],
   "source": [
    "# It's better to do this an environment variable but putting it in plain text for clarity\n",
    "openai_api_key = os.getenv(\"OPENAI_API_KEY\", 'YourAPIKey')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29c41e97",
   "metadata": {},
   "source": [
    "Let's start off by creating our LLM. We're using gpt4 to take advantage of its increased ability to follow instructions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "f0060ded",
   "metadata": {},
   "outputs": [],
   "source": [
    "chat = ChatOpenAI(\n",
    "    model_name=\"gpt-3.5-turbo-0613\", # Cheaper but less reliable\n",
    "    temperature=0,\n",
    "    max_tokens=2000,\n",
    "    openai_api_key=openai_api_key\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b0786a3",
   "metadata": {},
   "source": [
    "### Function Calling Hello World Example\n",
    "\n",
    "Create an object that holds information about the fields you'd like to extract"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "9d7d6783",
   "metadata": {},
   "outputs": [],
   "source": [
    "functions = [\n",
    "    {\n",
    "        \"name\": \"get_food_mentioned\",\n",
    "        \"description\": \"Get the food that is mentioned in the review from the customer\",\n",
    "        \"parameters\": {\n",
    "            \"type\": \"object\",\n",
    "            \"properties\": {\n",
    "                \"food\": {\n",
    "                    \"type\": \"string\",\n",
    "                    \"description\": \"The type of food mentioned, ex: Ice cream\"\n",
    "                },\n",
    "                \"good_or_bad\": {\n",
    "                    \"type\": \"string\",\n",
    "                    \"description\": \"whether or not the user thought the food was good or bad\",\n",
    "                    \"enum\": [\"good\", \"bad\"]\n",
    "                }\n",
    "            },\n",
    "            \"required\": [\"location\"]\n",
    "        }\n",
    "    }\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "d120776d",
   "metadata": {},
   "outputs": [],
   "source": [
    "output = chat(messages=\n",
    "     [\n",
    "         SystemMessage(content=\"You are an helpful AI bot\"),\n",
    "         HumanMessage(content=\"I thought the burgers were awesome\")\n",
    "     ],\n",
    "     functions=functions\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "496d7d5b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "    \"function_call\": {\n",
      "        \"name\": \"get_food_mentioned\",\n",
      "        \"arguments\": \"{\\n  \\\"food\\\": \\\"burgers\\\",\\n  \\\"good_or_bad\\\": \\\"good\\\"\\n}\"\n",
      "    }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "print(json.dumps(output.additional_kwargs, indent=4))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff82bc96",
   "metadata": {},
   "source": [
    "### Pydantic Model\n",
    "\n",
    "Now let's do the same thing but with a pydantic model rather than json schema"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "966adb72",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.pydantic_v1 import BaseModel, Field\n",
    "import enum\n",
    "\n",
    "class GoodOrBad(str, enum.Enum):\n",
    "    GOOD = \"Good\"\n",
    "    BAD = \"Bad\"\n",
    "\n",
    "class Food(BaseModel):\n",
    "    \"\"\"Identifying information about a person's food review.\"\"\"\n",
    "\n",
    "    name: str = Field(..., description=\"Name of the food mentioned\")\n",
    "    good_or_bad: GoodOrBad = Field(..., description=\"Whether or not the user thought the food was good or bad\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "323f6347",
   "metadata": {},
   "outputs": [],
   "source": [
    "output = chat(messages=\n",
    "     [\n",
    "         SystemMessage(content=\"You are an helpful AI bot\"),\n",
    "         HumanMessage(content=\"I thought the burgers were awesome\")\n",
    "     ],\n",
    "     functions=[{\n",
    "         \"name\": \"FoodExtractor\",\n",
    "         \"description\": (\n",
    "             \"Identifying information about a person's food review.\"\n",
    "         ),\n",
    "         \"parameters\": Food.schema(),\n",
    "        }\n",
    "     ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "1944735a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content='', additional_kwargs={'function_call': {'name': 'FoodExtractor', 'arguments': '{\\n  \"name\": \"burgers\",\\n  \"good_or_bad\": \"Good\"\\n}'}})"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95de736a",
   "metadata": {},
   "source": [
    "But LangChain has an abstraction for us that we can use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "954619a5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Food(name='burgers', good_or_bad=<GoodOrBad.GOOD: 'Good'>)]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain.chains import create_extraction_chain_pydantic\n",
    "\n",
    "# Extraction\n",
    "chain = create_extraction_chain_pydantic(pydantic_schema=Food, llm=chat)\n",
    "\n",
    "# Run \n",
    "text = \"\"\"I like burgers they are great\"\"\"\n",
    "chain.run(text)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fbe68a3d",
   "metadata": {},
   "source": [
    "### Multiple Results\n",
    "\n",
    "Let's try to extract multiple objects from the same text. I'll create a person object now"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "906c0f05",
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Sequence\n",
    "\n",
    "chat = ChatOpenAI(\n",
    "    model_name=\"gpt-4-0613\", # Cheaper but less reliable\n",
    "    temperature=0,\n",
    "    max_tokens=2000,\n",
    "    openai_api_key=openai_api_key\n",
    ")\n",
    "\n",
    "class Person(BaseModel):\n",
    "    \"\"\"Someone who gives their review on different foods\"\"\"\n",
    "\n",
    "    name: str = Field(..., description=\"Name of the person\")\n",
    "    foods: Sequence[Food] = Field(..., description=\"A food that a person mentioned\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "0a5a1f28",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extraction\n",
    "chain = create_extraction_chain_pydantic(pydantic_schema=Person, llm=chat)\n",
    "\n",
    "# Run \n",
    "text = \"\"\"amy likes burgers and fries but doesn't like salads\"\"\"\n",
    "output = chain.run(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "24d7c624",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Person(name='amy', foods=[Food(name='burgers', good_or_bad=<GoodOrBad.GOOD: 'Good'>), Food(name='fries', good_or_bad=<GoodOrBad.GOOD: 'Good'>), Food(name='salads', good_or_bad=<GoodOrBad.BAD: 'Bad'>)])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "52eb267c",
   "metadata": {},
   "source": [
    "**User Query Extraction**\n",
    "\n",
    "Let's do another fun example where we want to extract/convert a query from a user"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "e98cdd1a",
   "metadata": {},
   "outputs": [],
   "source": [
    "class Query(BaseModel):\n",
    "    \"\"\"Extract the change a user would like to make to a financial forecast\"\"\"\n",
    "\n",
    "    entity: str = Field(..., description=\"Name of the category or account a person would like to change\")\n",
    "    amount: int = Field(..., description=\"Amount they would like to change it by\")\n",
    "    year: int = Field(..., description=\"The year they would like the change to\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "02f45b66",
   "metadata": {},
   "outputs": [],
   "source": [
    "chain = create_extraction_chain_pydantic(pydantic_schema=Query, llm=chat)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "0a42a784",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Query(entity='inventory', amount=10, year=2022)]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.run(\"Can you please add 10 more units to inventory in 2022?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "00a15023",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Query(entity='revenue', amount=-3, year=2021)]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.run(\"Remove 3 million from revenue in 2021\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75568586",
   "metadata": {},
   "source": [
    "## Opening Attributes - Real World Example\n",
    "\n",
    "[Opening Attributes](https://twitter.com/GregKamradt/status/1643027796850253824) (my sample project for this application)\n",
    "\n",
    "If anyone wants to strategize on this project DM me on twitter"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4932a241",
   "metadata": {},
   "source": [
    "We are going to be pulling jobs from Greenhouse. No API key is needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "3cb6525f",
   "metadata": {},
   "outputs": [],
   "source": [
    "def pull_from_greenhouse(board_token):\n",
    "    # If doing this in production, make sure you do retries and backoffs\n",
    "    \n",
    "    # Get your URL ready to accept a parameter\n",
    "    url = f'https://boards-api.greenhouse.io/v1/boards/{board_token}/jobs?content=true'\n",
    "    \n",
    "    try:\n",
    "        response = requests.get(url)\n",
    "    except:\n",
    "        # In case it doesn't work\n",
    "        print (\"Whoops, error\")\n",
    "        return\n",
    "        \n",
    "    status_code = response.status_code\n",
    "    \n",
    "    jobs = response.json()['jobs']\n",
    "    \n",
    "    print (f\"{board_token}: {status_code}, Found {len(jobs)} jobs\")\n",
    "    \n",
    "    return jobs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "636e4e8b",
   "metadata": {},
   "source": [
    "Let's try it out for [Okta](https://www.okta.com/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "d2b794b9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "okta: 200, Found 155 jobs\n"
     ]
    }
   ],
   "source": [
    "jobs = pull_from_greenhouse(\"okta\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10c8651d",
   "metadata": {},
   "source": [
    "Let's look at a sample job with it's raw dictionary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "1a458cb9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Keep in mind that my job_ids will likely change when you run this depending on the postings of the company\n",
    "job_index = 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "b2b394e3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Preview:\n",
      "\n",
      "{\"absolute_url\": \"https://www.okta.com/company/careers/opportunity/5299914?gh_jid=5299914\", \"data_compliance\": [{\"type\": \"gdpr\", \"requires_consent\": false, \"requires_processing_consent\": false, \"requires_retention_consent\": false, \"retention_period\": null}], \"internal_job_id\": 2652958, \"location\": {\"name\": \"Bengaluru, India\"}, \"metadata\": null, \"id\": 5299914, \"updated_at\": \"2023-09-26T12:28:23-04:\n"
     ]
    }
   ],
   "source": [
    "print (\"Preview:\\n\")\n",
    "print (json.dumps(jobs[job_index])[:400])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5662c3cb",
   "metadata": {},
   "source": [
    "Let's clean this up a bit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "ce0ee96a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# I parsed through an output to create the function below\n",
    "def describeJob(job_description):\n",
    "    print(f\"Job ID: {job_description['id']}\")\n",
    "    print(f\"Link: {job_description['absolute_url']}\")\n",
    "    print(f\"Updated At: {datetime.fromisoformat(job_description['updated_at']).strftime('%B %-d, %Y')}\")\n",
    "    print(f\"Title: {job_description['title']}\\n\")\n",
    "    print(f\"Content:\\n{job_description['content'][:550]}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "72759972",
   "metadata": {},
   "source": [
    "We'll look at another job. This job_id may or may not work for you depending on if the position is still active."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "12a43cea",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Job ID: 5299914\n",
      "Link: https://www.okta.com/company/careers/opportunity/5299914?gh_jid=5299914\n",
      "Updated At: September 26, 2023\n",
      "Title: Accounts Receivable Manager, AWS Marketplace (night shift)\n",
      "\n",
      "Content:\n",
      "&lt;div class=&quot;content-intro&quot;&gt;&lt;p&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;strong&gt;Get to know Okta&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;\n",
      "&lt;p&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;br&gt;&lt;/span&gt;Okta is The World’s Identity Company. We free everyone to safely use any technology—anywhere, on any device or app. Our Workforce and Customer Identity Clouds enable secure yet flexible access, authentication, and automation that transforms how people move through the digital world, putting Identity at t\n"
     ]
    }
   ],
   "source": [
    "# Note: I'm using a hard coded job id below. You'll need to switch this if this job ever changes\n",
    "# and it most definitely will!\n",
    "job_id = 5299914\n",
    "\n",
    "job_description = [item for item in jobs if item['id'] == job_id][0]\n",
    "    \n",
    "describeJob(job_description)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e2a3a009",
   "metadata": {},
   "source": [
    "I want to convert the html to text, we'll use BeautifulSoup to do this. There are multiple methods you could choose from. Pick what's best for you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "bd3e15c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "soup = BeautifulSoup(job_description['content'], 'html.parser')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "5a991fa4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "**Get to know Okta**\n",
      "\n",
      "\n",
      "  \n",
      "Okta is The World’s Identity Company. We free everyone to safely use any technology—anywhere, on any device or app. Our Workforce and Customer Identity Clouds enable secure yet flexible access, authentication, and automation that transforms how people move through the digital world, putting Identity at the heart of business security and growth.   \n",
      "  \n",
      "At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box - we’re looking for lifelong learners and people who can make us better with their unique experie\n"
     ]
    }
   ],
   "source": [
    "text = soup.get_text()\n",
    "\n",
    "# Convert your html to markdown. This reduces tokens and noise\n",
    "text = md(text)\n",
    "\n",
    "print (text[:600])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f673bc2",
   "metadata": {},
   "source": [
    "Let's create a Kor object that will look for tools. This is the meat and potatoes of the application"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "ee906edf",
   "metadata": {},
   "outputs": [],
   "source": [
    "class Tool(BaseModel):\n",
    "    \"\"\"The name of a tool or company\"\"\"\n",
    "\n",
    "    name: str = Field(..., description=\"Name of the food mentioned\")\n",
    "        \n",
    "class Tools(BaseModel):\n",
    "    \"\"\"A tool, application, or other company that is listed in a job description.\"\"\"\n",
    "\n",
    "    tools: Sequence[Tool] = Field(..., description=\"\"\" A tool or technology listed\n",
    "        Examples:\n",
    "        * \"Experience in working with Netsuite, or Looker a plus.\" > NetSuite, Looker\n",
    "        * \"Experience with Microsoft Excel\" > Microsoft Excel\n",
    "    \"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "fd0074a6",
   "metadata": {},
   "outputs": [],
   "source": [
    "chain = create_extraction_chain_pydantic(pydantic_schema=Tools, llm=chat)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "f8015920",
   "metadata": {},
   "outputs": [],
   "source": [
    "output = chain(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "decc90ed",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Tools(tools=[Tool(name='Okta'), Tool(name='AWS Marketplace'), Tool(name='Tackle.io'), Tool(name='Salesforce'), Tool(name='NetSuite'), Tool(name='SFDC'), Tool(name='Tackle')])]"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "output['text']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6855ab40",
   "metadata": {},
   "source": [
    "[OpenAI GPT4 Pricing](https://help.openai.com/en/articles/7127956-how-much-does-gpt-4-cost)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "4e67af85",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total Tokens: 1400\n",
      "Prompt Tokens: 1295\n",
      "Completion Tokens: 105\n",
      "Successful Requests: 1\n",
      "Total Cost (USD): $0.045149999999999996\n"
     ]
    }
   ],
   "source": [
    "with get_openai_callback() as cb:\n",
    "    result = chain(text)\n",
    "    print(f\"Total Tokens: {cb.total_tokens}\")\n",
    "    print(f\"Prompt Tokens: {cb.prompt_tokens}\")\n",
    "    print(f\"Completion Tokens: {cb.completion_tokens}\")\n",
    "    print(f\"Successful Requests: {cb.successful_requests}\")\n",
    "    print(f\"Total Cost (USD): ${cb.total_cost}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b770a9e8",
   "metadata": {},
   "source": [
    "Suggested To Do if you want to build this out:\n",
    "\n",
    "* Reduce amount of HTML and low-signal text that gets put into the prompt\n",
    "* Gather list of 1000s of companies\n",
    "* Run through most jobs (You'll likely start to see duplicate information after the first 10-15 jobs per department)\n",
    "* Store results\n",
    "* Snapshot daily as you look for new jobs\n",
    "* Follow [Greg](https://twitter.com/GregKamradt) on Twitter for more tools or if you want to chat about this project\n",
    "* Read the user feedback below for what else to build out with this project (I reached out to everyone who signed up on twitter)\n",
    "\n",
    "\n",
    "### Business idea: Job Data As A Service\n",
    "\n",
    "Start a data service that collects information about company's jobs. This can be sold to investors looking for an edge.\n",
    "\n",
    "After posting [this tweet](https://twitter.com/GregKamradt/status/1643027796850253824) there were 80 people that signed up for the trial. I emailed all of them and most were job seekers looking for companies that used the tech they specialized in.\n",
    "\n",
    "The more interesting use case were sales teams + investors.\n",
    "\n",
    "#### Interesting User Feedback (Persona: Investor):\n",
    "\n",
    "> Hey Gregory, thanks for reaching out. <br><br>\n",
    "I always thought that job posts were a gold mine of information, and often suggest identifying targets based on these (go look at relevant job posts for companies that might want to work with you). Secondly, I also automatically ping BuiltWith from our CRM and send that to OpenAI and have a summarized tech stack created - so I see the benefit of having this as an investor. <br><br>\n",
    "For me personally, I like to get as much data as possible about a company. Would love to see job post cadence, type of jobs they post and when, notable keywords/phrases used, tech stack (which you have), and any other information we can glean from the job posts (sometimes they have the title of who you'll report to, etc.). <br><br>\n",
    "For sales people, I think finer searches, maybe even in natural language if possible - such as \"search for companies who posted a data science related job for the first time\" - would be powerful.\n",
    "\n",
    "If you do this, let me know! I'd love to hear how it goes."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
